Email Hoax Detection System Using Levenshtein Distance Method

Authors

  • Yoke Yie Chen
  • Suet-Peng Yong
  • Adzlan Ishak
Abstract

Hoaxes are non-malicious viruses. They thrive by deceiving human perception, conveying false claims as truth. Throughout history, hoaxes have been able to influence many people, to the extent of tarnishing the victim’s image and credibility. Moreover, wrong and misleading information has always hindered human growth. Some hoaxes are crafted to obtain personal data by convincing victims that the data are required for official purposes. Hoaxes differ from spam in that they masquerade behind the addresses of people related to us, either directly or indirectly. Most of the time they appear as forwarded messages, and sometimes they appear to come from legitimate companies. This paper addresses this issue by developing a hoax detection system that incorporates a text matching method based on the Levenshtein distance measure. The proposed model is used to identify text-based hoax emails. Sensitivity and specificity are used to evaluate the accuracy of the system in identifying hoax emails.
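The abstract names two ingredients: Levenshtein distance for text matching, and sensitivity and specificity for evaluation. The Python sketch below illustrates both under stated assumptions and is not the authors' implementation. The function names, the normalised similarity score, the corpus of known hoax texts, and the 0.8 threshold are all hypothetical choices made for illustration; the abstract does not say how the distance is turned into a decision.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb, or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1.0 for identical strings, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def is_hoax(email_text: str, known_hoaxes: list[str], threshold: float = 0.8) -> bool:
    """Flag an email if it is close enough to any known hoax text (hypothetical rule)."""
    text = email_text.lower()
    return any(similarity(text, h.lower()) >= threshold for h in known_hoaxes)


def sensitivity_specificity(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    known = ["forward this message to 10 friends"]  # made-up example text
    print(levenshtein("kitten", "sitting"))
    print(is_hoax("Forward this message to ten friends", known))
    print(sensitivity_specificity([True, True, False, False],
                                  [True, False, False, True]))
```

Here sensitivity is the fraction of actual hoaxes that are flagged, and specificity is the fraction of legitimate emails that are left unflagged. In a sketch like this, raising the threshold would typically trade sensitivity for specificity.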

Similar resources

EPM–RT–2011-01 Levenshtein Edit Distance-based Type III Clone Detection Using Metric Trees

This paper presents an original technique for clone detection with metric trees using Levenshtein distance as the metric defined between two code fragments. This approach achieves a faster empirical performance. The resulting clones may be found with varying thresholds allowing type 3 clone detection. Experimental results of metric trees performance as well as clone detection statistics on an o...

Detecting repetitions in spoken dialogue systems using phonetic distances

This paper addresses the problem of automatic detection of repeated turns in Spoken Dialogue Systems. Repetitions can be a symptom of problematic communication between users and systems. Such repetitions are often due to speech recognition errors, which in turn makes it hard to use speech recognition to detect repetitions. We present an approach to detect repetition using the phonetic distance ...

An Intelligent Automatic Hoax Detection System

Although they sometimes seem harmless, hoaxes represent a non-negligible threat to individuals’ awareness of real-life situations by deceiving them, while at the same time harming the image of their organizations, which can lead to substantial financial losses. The spreading of hoaxes also affects the normal operating regime of networks and the efficiency of workers. In this paper we present an...

Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data

The Levenshtein dialect distance method has proven to be a successful method for measuring phonetic distances between Dutch dialects. The aim of the present investigation is to validate the Levenshtein dialect distance with perceptual data from a language area other than the Dutch, namely Norway. We calculate the correlation between the Levenshtein distances and the distances between 15 Norwegi...

E-Mail System for Automatic Hoax Recognition

With the advent of the information society, false and inaccurate information has become one of the major problems. Hoaxes and unsolicited commercial e-mail messages (SPAM) are important examples of such information. A conceptual solution together with the developed system for automatic hoax recognition is presented. Hoax recognition is done in several parallel steps to increase system accuracy and...

Journal title:
  • JCP

Volume 9  Issue 

Pages  -

Publication date 2014